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Abstract 

Background: The human AP0BEC3G (A3G) protein activity is associated with innate immunity against HIV-1 by 
inducing high rates of guanosines to adenosines (G-to-A) mutations [viz., hypermutation) in the viral DNA. If 
hypermutation is not enough to disrupt the reading frames of viral genes, it may likely increase the HIV-1 diversity. 
To counteract host innate immunity HIV-1 encodes the Vif protein that binds A3G protein and form complexes to 
be degraded by cellular proteolysis. 

Methods: Here we studied the pattern of substitutions in the vif gene and its association with clinical status of 
HIV-1 infected individuals. To perform the study, unique vif gene sequences were generated from 400 
antiretroviral-naive individuals. 

Results: The codon pairs: 78-1 54, 85-1 54, 1 01 -1 57, 1 05-1 57, and 1 05-1 76 of vif gene were associated with CD4+ T 
cell count lower than 500 cells per mm 3 . Some of these codons were located in the 81 LGQGVSIEW 89 region and 
within the BC-Box. We also identified codons under positive selection clustered in the N-terminal region of Vif 
protein, between 21 WKSLVK 26 and 40 YRHHY 44 regions {i.e., 31, 33, 37, 39), within the BC-Box {i.e., 155, 159) and the 
Cullin5-Box {i.e., 168) of vif gene. All these regions are involved in the Vif-induced degradation of A3G/F complexes 
and the N-terminal of Vif protein binds to viral and cellular RNA. 

Conclusions: Adaptive evolution of vif gene was mostly to optimize viral RNA binding and A3G/F recognition. 
Additionally, since there is not a fully resolved structure of the Vif protein, codon pairs associated with CD4+ T cell 
count may elucidate key regions that interact with host cell factors. Here we identified and discriminated codons 
under positive selection and codons under functional constraint in the vif gene of HIV-1. 
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Background 

The APOBEC (apolipoprotein B mRNA-editing catalytic 
polypeptide) gene family includes several members, 
APOBEC1, APOBEC2, APOBEC3 and APOBEC4 that 
have cytidine deaminase activity [1-3]. Notably, two 
genes of APOBEC3 (APOBEC3G and APOBEC3F) have 
been linked to innate immunity, and their ability to 
restrain retroviral infections has been widely recognized 
[4,5]. APOBEC3G (A3G) induces cytidine deamination 
(C— >U) in the negative strand of HIV-1 during reverse 
transcription, hence inducing substitutions of guanosines 
for adenosines (G— >A) in the positive strand of the viral 
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DNA. This mechanism is known as hypermutation and 
may cause the appearance of stop codons followed by 
the complete loss of reading frames of the viral genes. 
HIV-1 counteracts A3G activity through ubiquitination 
of this host protein through the activity of Vif proteins. 
Vif proteins assemble with viral-specific E3 ubiquitin ligase 
through its interaction with cellular Cullin5 (Cul5)- 
ElonginB-ElonginC proteins, inducing ubiquitination of 
A3G and consequent degradation by the proteasomal 
complex [6-10]. Hypermutation is not enough to curb 
HIV-1 infection because proviruses with varying amounts 
of G— >A mutations are commonly observed in the host 
genome [11-13]. Furthermore, the polymorphisms in 
the vif gene have been associated with more or less 
efficacy to neutralize A3G [5]. For these reasons, it has 
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been hypothesized that when G— >A mutations are ineffect- 
ive in neutralizing viral genomes, the side effect is that A3G 
can actually promote HIV-1 diversification [14,15]. 

The interaction between the cellular A3G and the 
vif gene of HIV likely emerged from a process of co- 
adaptation due to recurrent retroviral infections during the 
evolutionary history of primates [2,16]. Specifically, the 
repeated encounters with retroviruses probably promoted 
the fixation of the allelic variants in the genes of the human 
family of the APOBEC [17-20]. 

Recently, we found that A3G polymorphisms are 
mostly unrelated with CD4+ T cell counts of HIV-1 
infected Brazilians [21]. Thus, population-based studies 
may provide conflicting results regarding the overall 
effect of A3G-vif interactions [5,21-23]. To gain more 
insights on the A3G-HIV interaction, we used codon-based 
approaches to determine the function of amino acid substi- 
tutions of Vif protein. The study was made through the 
analysis of 400 vif gene sequences obtained from HIV-1 
infected drug-naive individuals. 

Methods 

HIV-1 infected individuals 

This study was approved by the Ethics Committee of the 
Federal University of Sao Paulo and by the Brazilian 
Ministry Health; all biological samples were obtained in 
full accordance with signed informed consent forms. 

DNA samples 

Proviral DNA was extracted from heparinized peripheral 
blood obtained from 400 HIV-1 infected individuals that 
were drug-naive (not receiving any antiretroviral therapy) 
and asymptomatic when samples were collected. From 
each patient one unique sequence of HIV-1 w/gene was 
generated, then our study focused on the diversity of the 
virus at the population level. The study group represented 
almost equally the male (55.5%) and female (45.5%) popu- 
lations and was composed of three ethnics groups: white 
(49.2%), mulatto (41.6%) and black (9.2%) individuals. The 
CD4 counts (cells/mm 3 ) ranged from 20 to 5362 and the 
virus load ranged from 80 to 7.8 x 10 7 (RNA copies/ml of 
plasma). HIV-1 -infected individuals sampled from Sao 
Paulo city between 1989 and 2006 comprised our target 
population. These individuals were enrolled in the AIDS 
program of the Brazilian Ministry of Health. 

PCR and sequencing of the vif gene of HIV-1 

The vif sequence was amplified by a nested PCR. The 
primers were designed to cover the entire vif gene, 
according to the reference sequence HXB2 (HIV Sequence 
Database). The first round was performed with the primers, 
Platinum Taq DNA Polymerase, 10X Reaction Buffer, 
MgCl2 (Invitrogen, USA) and deoxyribonucleotide triphos- 
phates (dNTP; GE Healthcare, USA). The second-round 



PCR was carried out using 5 ul of the first-round product 
and internal primers. Amplified vif DNA was purified and 
then sequenced using the BigDye Terminator kit, version 
3.1 (Applied Biosystems/Perkin Elmer, Foster City, CA). 
The samples were electrophoresed on an ABI 3130 genetic 
analyzer, and the sequencing data were analyzed using ABI 
software Sequencing Analysis Software. 

Sequences Analysis. Nucleotide and protein sequence 
analyses and edits were performed using the 
Sequencher DNA Sequence Assembly Software (Gene 
Codes Corporation, USA). 

Hypermutation detection in the integrase gene of HIV-1 

We used a previously described approach to detect the 
presence of hypermutation in the PCR products of the 
integrase gene of HIV-1 [24]. Briefly, the PCR products 
were initially analyzed on 1% agarose gels to confirm 
amplification. After that, a second electrophoresis was 
performed with HA yellow (9 uL/mL) incorporated into 
the agarose gel solution at 65°C and pH 7.5. The electro- 
phoresis was performed at 80 V in 0.5 x Tris-borate-EDTA 
(TBE) for 150 min. The HA yellow gel was visualized 
after immersion in a solution of ethidium bromide, 
using the Geldoc-it TS Imaging Systems Biolmaging 
(UVP, Cambridge, CA, EUA). HA yellow is a compound 
consisting of the DNA ligand, bisbenzamide, covalently 
linked to polyethylene glycol (PEG) (Resolve-It Kit - Vector 
Laboratories, Burlingname, CA, USA). Bisbenzamide binds 
preferentially to AT-rich regions in the DNA and, when 
coupled to PEG, retards DNA mobility during gel elec- 
trophoresis according to the AT content. We used three 
distinct samples that independently amplified as negative 
(no hypermutation) and positive controls (hypermutated). 
The hypermutation statuses of the controls were confirmed 
by bacterial cloning followed by sequencing. 

Sequence alignment and phylogenetic inference 

Initially, the sequences of the vif gene of HIV-1 were 
aligned using the ClustalX program [25]. Sequences 
with stop codons and hypermutations were excluded 
from the analyses. We used the Hypermut software 
(http://www.hiv.lanl.gov/content/sequence/HYPERMUT/ 
hypermut.html). 

After this editing process, the sequences were manually 
aligned using the SE-AL program, version 2.0 (Department 
of Zoology, Oxford University; http://evolve.zoo.ox.ac.uk/ 
software/). To construct maximum likelihood (ML) trees, 
we used the HKY model [26], as implemented in the 
PhyML software [27]. These trees were used mainly to the 
selective regimen analysis. 

Association of HIV vif gene and CD4+ cell counts 

We investigated whether individual codons or pairs of 
codons in w/gene were associated with levels of CD4+ 
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T cell counts. To do that linear regression and permutation 
tests were used. The log-transformed CD4 counts were 
regressed on the amino acids or amino acid pairs. To 
account for multiplicity, we generated 1000 sets of 
samples under the null hypothesis of no association by 
permuting the CD4+T counts. The ^-values obtained by 
the log likelihood ratio statistics were contrasted with the 
null distribution of minimum ^-values among amino acid 
positions with SNPs and pairs of these positions. 

Covariation among codons based on phylogenies 

A Bayesian Graph method (BGM) was used to explore 
covariation among amino acids in codons of the vifgene 
taking into account the phylogenetic information of the 
sequences [28]. Therefore, BGM considers the potential 
bias due to the founder effect and relaxes the assumption 
of pairwise associations. BMG reconstructs the maximum 
likelihood of evolutionary history of the extant sequences, 
and then it analyzes the joint probability distribution of 
substitution events among sites in the sequences through 
a Bayesian graph model. The method was used to detect 
co-evolving sites in vif. The analyses were performed 
assuming a GTR model [29], and sites with a marginal 
posterior probability of 0.85 were considered to be under 
epistasis. The analysis was performed on the Datamonkey 
web server (http://www.datamonkey.org). 

Detection of selective pressure 

We used a codon-based maximum likelihood method to 
estimate the selection pressures of the vif sequences. 
This approach estimates the likelihood of distinct models of 
codon evolution and computes the ratio (d N /d s =u) of the 
number of nonsynonymous (d N ) and synonymous (d s ) sub- 
stitution rates between sites considering the phylogenetic 
relationships of the sequences. 

The nonsynonymous/synonymous rate ratio (co) de- 
termines selective pressures at protein level. When 
selection (neutral) has no effect on the fitness the 
nonsynonymous and the synonymous mutations will 
occur at the same rates {d N =d s ). 

Situations where nonsynonymous mutations are dele- 
terious, negative (purifying) selection will reduce their 
rate of fixation (d N <d s ). If nonsynonymous mutations 
raise the fitness, their rate will be increased by positive 
selection (d N >d s ). 

We used the following codon models. The one-ratio 
model (MO) assumes a single co for all sites in the 
alignment and is the simplest model. The neutral model 
(Ml) allows for different proportions of conserved sites 
(co 0 =0) and neutral sites (co^l), both estimated from the 
data. Model 1 is the null hypothesis to test for positive 
selection. The selection model (M2) extends Ml and 
incorporates an additional class of sites with co ratios 
assuming values higher than one (co 2 > 1). Significant 



evidence for positive selection is provided if M2 signifi- 
cantly reject the null hypotheses, MO and Ml, and if the 
favored models contain a class of codons with co > 1. 
Statistical significance can be compared using a standard 
likelihood ratio test (LRT). These models are implemented 
in the CODEML program from the PAML v.4 package 
(http://abacus.gene.ucl.ac.uk/software/paml.html) [30] . 

Results 

Diversity of vif gene 

To characterize the sequences of the HIV-1 vif gene 
from Brazilians, we analyzed the nucleotide and amino 
acid substitutions on a site-by-site basis. The overall 
nucleotide distance of 235 subtype B isolates in the 
alignment of 581 nucleotides was 0.031±0.004. The 
translated Vif sequence of 192 amino acids identified 22 
singletons, 53 conserved and 138 variable sites. In general 
the amino acid composition was relatively conserved 
among subtypes in Brazil and all regions with biological 
functions were equally conserved. The genetic diversity 
was estimated assuming the HKY85 model and the analysis 
were performed using Mega 4.0 software [31]. 

Pairs of codons in vif gene associated with CD4+ cells 

The regression analysis indicated that no single amino 
acid positions in vif gene were significantly associated 
with the CD4+T counts. However, when we analyzed the 
impact of pairs of codons in the levels of CD4+T cells, 
the epistatic effects of five pairs of amino acids (i.e., 78- 
154, 85-154, 101-157, 105-157, and 105-176 pairs) were 
detected at a 5% significance level after correction for 
multiplicity (orange dashed lines in the Figure 1). Notably, 
most combinations of amino acids in these epistatic sites 
tend to be associated with CD4+ T cell counts below 500 
cells per mm 3 (see Figure 2 for a detailed description of 
pairs of residues and their correlation with CD4 counts). 
We used a proposed three-dimensional computational 
model of Vif [32] (PDB: 1VZF) to shown the location of 
the pairs of epistatic codons on the structure of this viral 
protein (Figure 3). 

Coevolving sites in the vif gene 

By using a posterior probability of 0.85, the BGM analysis 
detected three pairs of codons (i.e., 80-83, 80-86 and 
83-144) where amino acids were coevolving in vifgene. 
These sites were not the same epistatic sites identified 
by regression/permutation, although they were concen- 
trated in a specific genomic region of vif between sites 
78 to 86 and within the BC-box (see blue dashed lines 
in the Figure 1 and magenta dotted lines in the Figure 3). 
However when we reduced the threshold of the posterior 
distribution to 0.5, various other sites were indicated 
to be under epistasis, including those identified by the 
permutation analysis. 
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Figure 1 Consensus sequences of the vif gene of HIV-1. The consensus was obtained from an alignment of 71 sequences containing only subtype 
B viruses. Open and filled diamonds indicate sites under positive selection. The figure shows positively selected codons detected in non-recombinant 
(filled diamonds) sequences and detected in hypermutated viruses (open diamonds). The regions linked to Vif protein functions are highlighted in dark 
grey with white letters within. The BC-Box and the Cullin5-Box are also highlighted (yellow). Dashed grey lines mark regions where cytotoxic T 
lymphocytes (CTL) epitopes were previously identified in the Vif protein (http://www.hiv.lanl.gov/content/immunology/maps/maps.html). Orange ovals 
linked by dashed lines indicate epistatic sites associated with CD4+ cell counts detected by permutation test. Blue circles linked by dashed lines indicate 
co-evolving sites detected by BGM analysis. 



Adaptive mutations in the vif gene 

To explore the selective regimen acting on the vif gene 
of the subtype B lineage of Brazilian isolates, we used a 
codon-based model to estimate the dN/dS ratio. Recom- 
bination affects the reliability of likelihood ratio test 
(LRT) to discriminate models of positive selection [33], 
then we decided to analyze selective forces in vif 
sequences that have not recombined. To do that we used 
a Bayesian approach [34] to identify recombination-free 
sequences. These sequences were compiled in a data set 
composed of seventy-one (n=71) recombination-free iso- 
lates that were edited in order to exclude sequences with 
stop codons. The estimated likelihoods of model M2 
(-/=6882.348) indicated that the hypothesis of positive 
selection in the evolution of the vif gene was effectively 
accepted to the detriment of the null hypothesis of neu- 
tral evolution based on the likelihood of the Ml model 
(-/=6985.462) (x 2 =206.2277, /kO.0001, with two degree 
of freedom). In detail, the M2 (selection) model detected 
57.0% of codons with (0=0.049, 35.5% with co=1.0 and 
7.5% with (0=3.7. Thus, most sites of the vif gene evolved 
under purifying selection. The sites under positive selec- 
tion (Le., 31, 33, 37, 39, 47, 63, 92, 127, 159 and 168) 
were mapped onto the consensus of vif sequences from 
the subtype B lineage (Figure 1). Positively selected sites 
were distributed along the extension of the gene; codons 
under positive selection were also detected within regions 
that have important biological functions, such as the 



BC-Box and Cullin5-Box (filled diamonds in the Figure 1). 
In addition, a dataset of vif sequences (n=33) from 
hypermutated viruses, based on the integrase gene, was 
analyzed to explore the selective regime. According to the 
M2, most sites (53.6%) evolved under purifying selection 
with (0=0.035, 37.2% were conserved (<o=l) codons, and 
9.2% were under strong positive selection with (0=4.063. 
These sites under positive selection were exactly the 
same as those detected in the recombination-free data 
(open diamonds in the Figure 3). Mapping positively 
selected sites in the Vif protein sequence and in the 3D 
structure of a computational model of Vif protein [32] 
(PDB: 1VZF), revealed they were concentrated between 
the 21 WKSLVK 26 and 40 YRHHY 44 motif {i.e., 31, 33, 37, 
39 and 47), both important to Vif- induced degradation of 
A3G/F complexes. It is important to mention that the 
N-terminal region of Vif protein binds selectively to 
HIV-1 genomic RNA [35] and mRNA of A3G [36]. In 
addition, within the BC-Box and Cullin5-Box, we also 
detected codons under high selective pressure (Le., 155, 159 
and 168) (Figure 1 and Figure 4). Positively selected sites 
may indicate adaptive substitutions that usually evolve as a 
consequence of the host immune response against viral 
proteins. According to the Los Alamos HIV immunology 
database, cytotoxic T-lymphocyte (CTL) epitopes have been 
previously identified in nearly all regions of Vif protein 
(http://www.hiv.lanl.gov/content/immunology/maps/maps. 
html/). We then concatenated these CTL epitopes to show 
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Figure 2 (See legend on next page.) 
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(See figure on previous page.) 

Figure 2 Epistatic codons and CD4 levels. Upper panel of each page shows the correlation between epistatic pairs and CD4+ T cell 
counts. The x-axis depicts the distinct combinations of amino acids (pairs), y-axis shows the levels of CD4+ T cell counts measured as 
number of cell per mm3 of plasma. Lower panel shows the location of epistatic sites (orange dots) on the structure of a computational 
model (Balaji et al., Bioinformation. 2006 Dec 6;1 :290-309[PMID: 17597910], PDB=1VZF) of Vif Protein of HIV-1 . Yellow regions depict the 
SOCS BOX (BC Box+Cullin Box) and grey regions are A3 binding sites (see text for more details). Visualization and edition of structures were 
done using PyMOL software (http://www.pymol.org). 



them in the consensus sequence of vif gene (dotted line in 
the Figure 1). Our results indicate that the codons under 
positive selection were not always located within the CTL 
epitopes, whereas other CTL regions were lacking 
positively selected codons. Indeed, signatures of posi- 
tive selection by CTL or antibody immune pressure are 
a host-specific mechanism that is rarely identified by 
population-based analyses [37-39]. 



Discussion 

While hypermutation induced by A3G activity is a natural 
barrier against retroviruses it is not enough to restrain HIV- 
1 infection. Sometimes, A3G activity can actually increase 
HIV-1 diversification [14,15] because G-to-A hypermutation 
is not always effective to neutralize all viral genomes within 
a specific host. Our results suggest that the diversity in the 
HIV-1 vZ/'gene is highly associated with adaptation to the 




Figure 3 Epistatic and co-evolving codons in the structure of Vif protein. 3D structure of a computational model proposed to the Vif 
protein of HIV-1 (PDB: 1VZF). Epistatic codons are depicted in distinct colors and dashed lines link codon pairs. Co-evolving codons are depicted 
as magenta sticks in the structure and magenta dashed lines are connecting them. Regions that interact with Apobec3 protein were highlighted 
in white color in the Vif structure. The BC-Box and the Cullin5-Box were highlighted in yellow color. Panels A, B and C correspond to distinct 
rotating angles of the structure of Vif. 
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Figure 4 Positively selected codons in the structure of vif protein. Codons under positive selection were mapped in structure of Vif protein 
of HIV-1. Blue spheres indicate locations of positively selected codons. Cyan area designates the BC-Box and the Cullin5-Box. Grey areas indicate 
regions of Vif that bind to Apobec3 protein. 



host proteins, mainly to increase interaction with cellular 
components (i.e., elongins and A3G and A3F) to induce 
APOBEC3 proteasomal degradation. 

Particularly, codons under positive selection were 
more concentrated in a region between the 21 WKSLVK 26 
and 40 YRHHY 44 motifs (Le., 31, 33, 37 and 39). Interest- 
ingly, the N-terminal region of Vif protein binds selectively 
HIV-1 genomic RNA [1] and sites in this region have 
DNA/RNA binding properties and also interact with A3G/ 
F [35,36]. Additionally, a study showed that the charge of 
amino acids located between 21 WKSLVK 26 and ^YRHHY 44 
motifs that are essential for maintaining the ability of vif 
to bind A3G [40]. Furthermore, it has been shown that 
the N-terminal region of Vif protein is highly structured, 
in contrast to the unstructured and flexible C-terminal 
[41-43]. Likely, the organized N-terminal structure of Vif 
functions as a connector that binds to A3G/F proteins 



and DNA/DNA molecules. On the other hand, positively 
selected sites detected in the C-terminal region of Vif pro- 
tein were more dispensed. They were found within the 
BC-Box and Cullin-Box (Le., 127), which both assemble 
with cellular components to induce A3G proteasomal 
degradation [6,43-45] . Positive selection was also detected 
in the vicinity of the PPLP motif (Le., 159), which controls 
multimerization of Vif proteins [46]. We also found one 
codon under positive selection (i.e., 168) in a region of Vif 
protein involved in the interaction with Gag, NCp7 and 
with the cellular membrane [47]. It is worth to note that 
in the N-terminal region of Vif protein sites under positive 
selection are clustered between the 21 WKSLVK 26 and 
40 YRHHY 44 motifs whereas in the C-terminal they tend to 
be dispersed (see Figure 1). Since Vif protein is highly 
structured at the N-terminal region, contrasting with the 
unstructured C-terminal [41,42]. Therefore the N-terminal 
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portion of Vif protein tends to be more protected while 
the C-terminal is solvent exposed and prone to immune 
recognition. Consequently adaptive evolution in vif gene 
could be related with the host immune surveillance against 
viral proteins. However, there are various positively selected 
sites outside CTL epitope regions. Additionally, wide vif 
sequence intervals in which many CTL epitopes have been 
empirically detected show no evidence of positive selection. 
Furthermore, selection driven by antibody evasion or host 
cell adaptation is rarely detected by population-based 
analysis [38,48,49]. Indeed, our results showing a distinct 
pattern of distribution of positively selected sites between 
N and C terminals of Vif protein mirrors the structural 
organization of this viral protein. For these reasons, posi- 
tive selection detected in vif codons likely emerged as an 
adaptive response to optimize HIV-1 RNA recognition 
and neutralization of A3G/F in the population. 

The comparison of amino acids of vif sequences 
revealed a limited variability in regions related with 
A3G/F activity, such as the regions 14 DRMR 17 and 
40 YRHHY 44 , which are important for vif-induced deg- 
radation of A3G [40,50]. This conservation of vif mo- 
tifs may indicate a significant evolutionary constraint 
that has been operating on this viral gene even among 
distinct lineages. Indeed, we found that most codons 
(60%) of vif gene are predominantly under purifying 
selection, and perhaps this pattern is needed to pre- 
serve its biological function during the viral life cycle. 
Likewise, HIV-1 nef gene is similarly under strong purifying 
selection [37,51]. Nevertheless, Nef is a multifunctional 
protein, and this feature can be observed by its 
plasticity, represented by extensive polymorphism and 
amino acid length variations that can be detected both in 
population samples and in the viral population within a 
single individual as well. 

In addition, an attempt was made to establish the 
influence of the patients' statuses on the selective regimen 
of HIV-1. In a previous population-level studies, we 
observed that CD4+ T cell counts higher than 200 
cells/ ul were associated with increased dN/dS values in 
the env gene of HIV-1 subtype B [48,49,52]. For this 
reason, we measured the intensity of positive selection 
in the vif gene from datasets categorized into three distinct 
levels of CD4 counts (>200; 200-400 and <400). We found 
no difference in the mean dN/dS among these data sets 
(3.65, 3.85 and 3.09 respectively). 

Perhaps our most remarkable finding was the identifica- 
tion of five pairs of codons (Le., 78-154, 85-154, 101-157, 
105-157, and 105-176 pairs) in the vif gene and their 
association with CD4+ cell levels lower than 500 cells 
per mm 3 . In each pair (epistatic codons) distinct amino 
acids combination were associated with distinct levels of 
CD4+ cells (see Figure 2 for a details). Notably, these 
pairs of codons were located mainly in the C-terminal 



of Vif protein (see Figure 1). It has been shown that the 
mutation 105 QLI 107 to 105 AAV 107 reduces the infectivity 
of HIV by 2% [53]. The amino acids between the 154th 
and 157th positions of the vif gene comprise the BC-box, 
the region that binds cellular elongin B and C to form 
complexes that trigger the ubiquitination and proteasomal 
degradation of the A3G proteins [44]. Since codons 154 
and 157 are located in the alpha-helix of the BC-box, it is 
likely that certain amino acids in these sites may affect the 
interaction with the cellular elongin B and C complex and 
thereby affect the efficacy of Vif-induced A3G proteasomal 
degradation. In addition, the 161 PPLP 164 motif is funda- 
mental to vif multimerization and interaction with cellular 
proteins [41,44,46]. Remarkably, proline-to-alanine substi- 
tutions in the 161 PPLP 164 motif have no effect to the vif 
structure although it decreased the ability this protein to 
form oligomers [41]. These findings suggest that domains 
in the C-terminal of Vif protein fold independently of each 
other and the flexibility of these domains is required to 
interact directly with distinct cellular counterparts [41,42]. 
Thus, we postulate that epistatic effect observed in pairs 
of codons, indicate electrostatic interaction of certain pairs 
of amino acids required to Vif activity. 

The presence of co-evolving sites was further investigated 
using a Bayesian graph model that explores associations 
between codon sites and accounts for the phylogenetic 
sign of the sequences. The results indicated that amino 
acids at sites 80-83, 80-86 and 83-144 of vif co-evolve 
in phylogenies constructed with vif gene of the HIV-1. 
Interestingly, although both methods did not indicate 
the same sites, these results corroborate the identifica- 
tion of a region between sites 78 to 86 of HIV-1 vif gene 
that has many sites co-evolving with codons located 
within the BC-box. 

Conclusion 

The host-virus interaction between A3G and vif are 
likely to affect AIDS in many instances. Conversely, the 
adaptive evolution in the HIV-1 vif gene is mainly 
explained by a response optimized to neutralize A3G 
activity. Co-evolution detected in some codons suggests 
that regions of the Vif protein are highly constrained 
and may have important function to the virus activity. 
Here, we identified and discriminated codons under 
positive selection and codons under functional constraint 
in the vif gene of HIV-1. 
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